Technical Webinars 2021
08/07 - 11:00 am
Denny Lee
Title
Bringing Reliability to your Data Lake with Apache Spark and Delta Lake (Databricks)
Abstract
Apache Spark has become the de facto open-source standard for big data processing thanks to its ease of use and performance. The open-source Delta Lake project improves Spark’s data reliability, with new capabilities like ACID transactions, Schema Enforcement, and Time Travel. This helps to ensure that data lakes and data pipelines can deliver high-quality and reliable data to downstream data teams for successful data analytics and machine learning projects. Join us to learn how Apache Spark 3.0 and Delta Lake enhance Data Lake reliability.
24/06 - 09:00 am
Dalya Baron
Title
Machine learning in astronomy: past, present, and future (Tel Aviv University)
Abstract
Machine Learning and Deep Learning have revolutionized many domains, and have sparked a burst of interest in astronomy as well. Astronomical datasets, being large, rich, and consisting of the beautifully complex ingredients of our Universe, seem to offer the perfect testbed for such tools. Past, ongoing, and future surveys have provided or are expected to provide multi-color and multi-temporal observations of hundreds of millions of stars and galaxies in our Universe. The application of Machine Learning tools to these datasets may offer new and exciting opportunities, but it also raises several questions. Will Machine Learning revolutionize the field of astronomy as well? Will it have a dramatic impact on our data processing pipelines, on the models we deduce from the data, or on the types of scientific questions that we ask? In this talk I will review different types of Machine Learning algorithms and will present some example applications of these tools to astronomical datasets. I will describe the current state of the field, focusing on the challenges we face. I will finish by proposing an approach that might help us harness the full potential of these tools in the future.
10/06 - 02:00 pm
Robert Nikutta
Title
Astro Data Lab - An open-access and open-data science platform (NOIRLab)
Abstract
The Astro Data Lab (https://datalab.noirlab.edu), or Data Lab for short, is an astronomical science platform developed at NOIRLab. It is open and free to all who are interested in astronomy, data science, and education efforts. Launched 4 years ago to enable remote access and analysis of survey data products generated by NOAO's telescopes, such as the Dark Energy Survey, Data Lab now serves almost 100 TB of catalog data from all major NOIRLab-operated facilities, plus high-value external datasets (e.g. Gaia, unWISE/AllWISE, SDSS, and others). It also provides access to over 2 PB of images. As a science platform, Data Lab combines big data holdings with a number of very useful data services, and with the powerful idea of remote analysis through a Jupyter notebook server. Data Lab user accounts come with a generous allocation of remote storage for files and personal databases. In this seminar I will introduce the concepts that underlie a science platform, and highlight the data services Data Lab is able to offer thanks to the co-location of big data and compute capacities. I will then discuss some of our current new developments, for instance to serve data products from massively-multiplexed spectroscopic surveys such as SDSS and DESI. Finally, I would also like to discuss with everybody some of the challenges and opportunities for making science platforms more unified in terms of user experience, and more interoperable from the operators' perspective. Time permitting, I can show a brief live demo of Data Lab.
11/03 - 11:00 am
Sunil Mucesh
Title
A machine learning approach to galaxy properties: Joint redshift-stellar mass PDFs with Random Forest (University College London)
Abstract
Point estimates of galaxy properties determined with a small number of photometric bands are imprecise. To fully characterise uncertainties in the estimates, accurate probability distribution functions (PDFs) are required. These PDFs must also reflect the correlations between different quantities of interest. Traditionally, such PDFs are derived by fitting model spectra to photometric data. However, this approach quickly becomes impractical for fitting modern datasets, where sample numbers can exceed hundreds of millions. In this talk, I present a novel method based on the Random Forest (RF) machine-learning (ML) algorithm to generate accurate joint redshift-stellar mass PDFs. I discuss different techniques used to validate both the marginal and joint PDFs. Finally, I demonstrate GALPRO, a Python package capable of producing multivariate PDFs of galaxy properties on-the-fly at incredible speeds, and discuss some of its applications.
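To make the idea of a joint PDF from a Random Forest more concrete, here is a minimal sketch using scikit-learn and NumPy. It is not the GALPRO interface and may differ from the method presented in the talk. It shows one common way to turn a multi-output forest into an empirical joint distribution: pool the training targets that share a leaf with the new object in every tree. The synthetic catalogue, the linear relations between "bands", redshift and stellar mass, and the helper names `make_toy_catalogue` and `joint_samples` are all assumptions made for illustration.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor


def make_toy_catalogue(n, rng):
    """Synthetic 'galaxies': two correlated targets and three noisy 'bands'."""
    redshift = rng.uniform(0.0, 1.5, n)
    # Hypothetical relation: stellar mass correlates with redshift, plus scatter.
    log_mass = 9.0 + 1.2 * redshift + rng.normal(0.0, 0.3, n)
    bands = np.column_stack([
        20.0 + 2.0 * redshift - 0.5 * (log_mass - 10.0),
        21.0 + 1.5 * redshift - 0.8 * (log_mass - 10.0),
        22.0 + 1.0 * redshift - 1.0 * (log_mass - 10.0),
    ]) + rng.normal(0.0, 0.2, (n, 3))
    return bands, np.column_stack([redshift, log_mass])


def joint_samples(forest, X_train, y_train, x_new):
    """Pool the training targets that share a leaf with x_new in each tree."""
    train_leaves = forest.apply(X_train)              # shape (n_train, n_trees)
    new_leaves = forest.apply(x_new.reshape(1, -1))   # shape (1, n_trees)
    same_leaf = train_leaves == new_leaves            # broadcast over training rows
    pooled = [y_train[same_leaf[:, t]] for t in range(same_leaf.shape[1])]
    return np.vstack(pooled)                          # columns: redshift, log mass


rng = np.random.default_rng(0)
X_train, y_train = make_toy_catalogue(2000, rng)
X_test, y_test = make_toy_catalogue(1, rng)

# A multi-output forest: both targets are predicted by the same trees,
# so the pooled leaf samples keep the correlation between them.
forest = RandomForestRegressor(n_estimators=50, min_samples_leaf=5, random_state=0)
forest.fit(X_train, y_train)

samples = joint_samples(forest, X_train, y_train, X_test[0])

# 2D histogram normalised to integrate to 1: an empirical joint PDF.
pdf, z_edges, m_edges = np.histogram2d(
    samples[:, 0], samples[:, 1], bins=20, density=True
)

# Marginal redshift PDF: integrate the joint PDF over the mass axis.
z_pdf = (pdf * np.diff(m_edges)).sum(axis=1)
```

Because both targets come from the same leaves, the pooled samples retain the correlation between redshift and stellar mass, which two separately estimated marginal PDFs would lose.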
04/03 - 01:00 pm
Leanne Guy
Title
Opportunities for Early Science with Rubin/LSST (Vera Rubin Observatory)
Abstract
Starting in 2023, the Vera C. Rubin Observatory will spend its first 10 years conducting the Legacy Survey of Space and Time (LSST). LSST will observe the entire visible southern sky and provide the widest, fastest and deepest view of the night sky ever observed. The resulting astronomical archive will be vast: 500 PB of image data products and a 15 PB final catalog of ~40 billion objects. LSST will dramatically advance our knowledge in many fields including dark energy and dark matter, as well as galaxy formation and potentially hazardous asteroids. In this talk I give an overview of the Rubin Observatory and the Legacy Survey of Space and Time, and discuss opportunities for early science and how to get involved.
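As a rough sense of scale for the archive sizes quoted in this abstract, the following back-of-the-envelope sketch derives two averages. It assumes decimal prefixes (1 PB = 10^15 bytes) and a uniform spread over objects and years. Both are simplifications for illustration, not figures from the talk.

```python
# Figures quoted in the abstract.
PB = 10**15            # bytes per petabyte, decimal prefix (assumption)
catalog_bytes = 15 * PB
n_objects = 40e9       # ~40 billion objects
image_bytes = 500 * PB
survey_years = 10

# Average catalog footprint per object, in kilobytes (1 kB = 1000 bytes).
kb_per_object = catalog_bytes / n_objects / 1000

# Average growth of the image archive per survey year, in petabytes.
image_pb_per_year = image_bytes / survey_years / PB

print(f"{kb_per_object:.0f} kB per catalog object")
print(f"{image_pb_per_year:.0f} PB of image data products per year")
```

Under these assumptions the catalog averages about 375 kB per object, and the image archive grows by about 50 PB per survey year.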